Unsupervised clustering can be significantly improved using supervision in the form of 
pairwise constraints, i.e., pairs of instances labeled as belonging to the same or different 
clusters. In recent years, a number of algorithms have been proposed for enhancing 
clustering quality by employing such supervision. Such methods use the constraints to either 
modify the objective function, or to learn the distance measure. We propose a probabilistic 

Clustering algorithms have become increasingly important in handling and analyzing data. 
Considerable work has been done in devising effective but increasingly specific clustering 
algorithms. In contrast, we have developed a generalized framework that accommodates 
diverse clustering algorithms in a systematic way. This framework views clustering as a 
general process of iterative optimization that includes modules for supervised learning and 
instance assignment. The framework has also suggested s ... 

huge Inverted File indexes. Inverted File indexes allow fast query resolution and good 
memory utilization since their d-gaps representation can be effectively and efficiently 
compressed by using variable length encoding methods. This paper proposes and evaluates 
some algorithms aimed to find an assignment of the document identifiers which minimizes 
the average values of d-gaps, thus enhanc ... 
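
The link between d-gap size and compressed index size can be illustrated with a small sketch. It is not taken from the paper: the function names (`to_dgaps`, `vbyte_encode`, `encoded_size`) are hypothetical, document identifiers are assumed to be positive integers, and a simple variable-byte code stands in for whichever variable length encoding an index actually uses.

```python
def to_dgaps(doc_ids):
    """Turn a strictly increasing posting list into d-gaps (first gap is the first ID)."""
    gaps, prev = [], 0
    for doc_id in doc_ids:
        if doc_id <= prev:
            raise ValueError("posting list must be strictly increasing and positive")
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def vbyte_encode(n):
    """Variable-byte code: 7 data bits per byte, low-order group first.

    The high bit is set on every byte except the last one of the number.
    """
    groups = []
    while True:
        groups.append(n & 0x7F)
        n >>= 7
        if n == 0:
            break
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def encoded_size(doc_ids):
    """Total number of bytes needed to store the posting list as encoded d-gaps."""
    return sum(len(vbyte_encode(gap)) for gap in to_dgaps(doc_ids))


# The same four documents under two hypothetical identifier assignments.
scattered = [1000, 50000, 200000, 3000000]
reassigned = [1, 2, 3, 4]
sizes = (encoded_size(scattered), encoded_size(reassigned))  # (12, 4)
```

Under these assumptions, an assignment that gives nearby identifiers to documents sharing a term produces smaller gaps, and smaller gaps need fewer bytes.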

Clustering is the unsupervised classification of patterns (observations, data items, or feature 
vectors) into groups (clusters). The clustering problem has been addressed in many 
contexts and by researchers in many disciplines; this reflects its broad appeal and 
usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult 
problem combinatorially, and differences in assumptions and contexts in different 
communities has made the transfer of useful generic co ... 

Keywords: cluster analysis, clustering applications, exploratory data analysis, incremental 
clustering, similarity indices, unsupervised learning 

The Johnson-Lindenstrauss lemma states that n points in a high-dimensional Hilbert space 
can be embedded with small distortion of the distances into an O(log n)-dimensional space 
by applying a random linear transformation. We show that similar (though weaker) 
properties hold for certain random linear transformations over the Hamming cube. We use 
these transformations to solve NP-hard clustering problems in the cube as well as in 
geometric settings. More specifically, ... 

Keywords: Clustering, high-dimensional data, polynomial-time approximation schemes 
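
As a rough illustration of the random linear transformation mentioned in the abstract, the sketch below projects Euclidean points through a scaled Gaussian matrix. It covers only the classical Euclidean case, not the paper's Hamming-cube transformations, and the names `random_projection` and `distance`, the seed, and the dimensions are assumptions chosen for the example.

```python
import math
import random


def random_projection(points, k, seed=0):
    """Map d-dimensional points to k dimensions with a random Gaussian matrix.

    Entries are drawn from N(0, 1) and scaled by 1/sqrt(k), so squared
    distances are preserved in expectation.
    """
    rng = random.Random(seed)
    d = len(points[0])
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix] for p in points]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: 4 random points in 500 dimensions, projected down to 200.
_rng = random.Random(1)
original = [[_rng.random() for _ in range(500)] for _ in range(4)]
projected = random_projection(original, k=200)
ratios = [
    distance(projected[i], projected[j]) / distance(original[i], original[j])
    for i in range(4)
    for j in range(i + 1, 4)
]
```

Each entry of `ratios` compares a pairwise distance after projection with the same distance before it; values close to 1 correspond to the small distortion the lemma describes.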

This paper compares text retrieval methods intended for office systems. The operational 
requirements of the office environment are discussed, and retrieval methods from database 
systems and from information retrieval systems are examined. We classify these methods 
and examine the most interesting representatives of each class. Attempts to speed up 
retrieval with special purpose hardware are also presented, and issues such as approximate 
string matching and compression are discussed. A quali ... 

Process migration is the act of transferring a process between two machines. It enables 
dynamic load distribution, fault resilience, eased system administration, and data access 
locality. Despite these goals and ongoing research efforts, migration has not achieved 
widespread use. With the increasing deployment of distributed systems in general, and 
distributed operating systems in particular, process migration is again receiving more 
attention in both research and product development. As hi ... 

Keywords: distributed operating systems, distributed systems, load distribution, process 
migration 

Full text available: pdf(220.93 KB) 

Determining the structure of biological macromolecules such as proteins and nucleic acids is 
an important element of molecular biology because of the intimate relation between form 
and function of these molecules. Individual sources of data about molecular structure are 
subject to varying degrees of uncertainty. Previously we have examined the parallelization 
of a probabilistic algorithm for combining multiple sources of uncertain data to estimate the 
three-dimensional structure of molecule ... 

June 1993 ACM Computing Surveys (CSUR), Volume 25 issue 2 
Full text available: pdf(9.37 MB) 

Database management systems will continue to manage large data volumes. Thus, efficient 
algorithms for accessing and manipulating large sets and sequences will be required to 
provide acceptable performance. The advent of object-oriented and extensible database 
systems will not solve this problem. On the contrary, modern data models exacerbate the 
problem: In order to manipulate large sets of complex objects as efficiently as today's 
database systems manipulate simple records, query-processi ... 

Keywords: complex query evaluation plans, dynamic query evaluation plans, extensible 
database systems, iterators, object-oriented database systems, operator model of 
parallelization, parallel algorithms, relational database systems, set-matching algorithms, 

Full text available: pdf(960.66 KB) Additional Information: full citation, abstract, references 

The clustered parallel computer (CPC), based on a workstation cluster, is becoming popular 
as a choice for high-performance network or parallel computing. However, operating system 
overheads, network protocols, and higher message-passing latency contribute to a lower 
overall communication performance in a cluster of workstations, increasing the likelihood 
that timesharing of parallel jobs can be used to improve system throughput in a workstation 
cluster. The traditional means by which the ... 

Felisa J. Vazquez-Abad, Lachlan L. H. Andrew, David Everitt 
January 2002 ACM Transactions on Modeling and Computer Simulation (TOMACS), 
Volume 12 Issue 1 

Full text available: pdf(385.62 KB) Additional Information: full citation, abstract, references, index terms 

Blocking probabilities in cellular mobile communication networks using dynamic channel 
assignment are hard to compute for realistic sized systems. This computational difficulty is 
due to the structure of the state space, which imposes strong coupling constraints amongst 
components of the occupancy vector. Approximate tractable models have been proposed, 
which have product form stationary state distributions. However, for real channel 
assignment schemes, the product form is a poor approximation a ... 

Keywords: Blocking probability, cellular networks, importance sampling 



12 The state of the art in locally distributed Web-server systems 
Valeria Cardellini, Emiliano Casalicchio, Michele Colajanni, Philip S. Yu 
June 2002 ACM Computing Surveys (CSUR), Volume 34 issue 2 

Full text available: pdf(1.41 MB) Additional Information: full citation, abstract, references, citings, index terms 

The overall increase in traffic on the World Wide Web is augmenting user-perceived 
response times from popular Web sites, especially in conjunction with special events. 
System platforms that do not replicate information content cannot provide the needed 
scalability to handle large traffic volumes and to match rapid and dramatic changes in the 
number of clients. The need to improve the performance of Web-based services has 
produced a variety of novel content delivery architectures. This article w ... 

Keywords: Client/server, World Wide Web, cluster-based architectures, dispatching 
algorithms, distributed systems, load balancing, routing mechanisms 



13 Formulation and preliminary test of an empirical theory of coordination in software 
engineering 

James D. Herbsleb, Audris Mockus 

September 2003 ACM SIGSOFT Software Engineering Notes, Proceedings of the 9th 
European software engineering conference held jointly with 11th ACM 
SIGSOFT international symposium on Foundations of software 
engineering, Volume 28 issue 5 

Full text available: pdf(283.88 KB) Additional Information: full citation, abstract, references, index terms 

Motivated by evidence that coordination and dependencies among engineering decisions in a 
software project are key to better understanding and better methods of software creation, 
we set out to create empirically testable theory to characterize and make predictions about 
coordination of engineering decisions. We demonstrate that our theory is capable of 
expressing some of the main ideas about coordination in software engineering, such as 
Conway's law and the effects of information hiding in modu ... 

Keywords: Conway's Law, coordination, empirical studies, empirical theory, engineering 
decisions 



14 Multiview access protocols for large-scale replication 
Xiangning Liu, Abdelsalam Helal, Weimin Du 

June 1998 ACM Transactions on Database Systems (TODS), Volume 23 issue 2 

Full text available: pdf(365.98 KB) Additional Information: full citation, abstract, references, citings, index terms, review 

The article proposes a scalable protocol for replication management in large-scale replicated 
systems. The protocol organizes sites and data replicas into a tree-structured, hierarchical 
cluster architecture. The basic idea of the protocol is to accomplish the complex task of 
updating replicated data with a very large number of replicas by a set of related but 
independently committed transactions. Each transaction is responsible for updating replicas 
in exactly one cluster and invoking add ... 

Keywords: data replication, large-scale systems, multiview access 

15 Resource allocation scheme for QoS provisioning in microcellular networks carrying 
multimedia traffic 

Anna Hac, Abhinay Armstrong 

September 2001 International Journal of Network Management, Volume 11 issue 5 

Full text available: pdf(393.59 KB) Additional Information: full citation, abstract, references, index terms 

We propose a new resource allocation scheme based on the concept of resource reservation 
and resource renegotiation. The new scheme is aimed at improving performance with regard 
to new call blocking rate, handoff dropping rate, forced call termination rate, and average 
bandwidth use. We compare our scheme with other schemes. The performance is evaluated 
by using simulation. Copyright © 2001 John Wiley & Sons, Ltd. 

16 The pebble crunching model for load balancing in concurrent hypercube ensembles 
J. Barhen, S. Gulati, S. S. Iyengar 

January 1988 Proceedings of the third conference on Hypercube concurrent computers 
and applications: Architecture, software, computer systems, and general 
issues - Volume 1 

Full text available: pdf(1.36 MB) Additional Information: full citation, abstract, references, citings, index terms 

The successful development of fifth generation systems requires enormous computational 
capability and flexibility necessitating the ability to achieve operational responses in hard 
real-time through optimal resource utilization. This entails dynamically balancing the 
computational load among all the processing nodes in the system. We propose a graph- 
theoretic, receiver-initiated, distributed protocol for dynamic load balancing in 
large-scale hypercube ensembles. Using attributed hyp ... 

17 Distributed file systems: concepts and examples 
Eliezer Levy, Abraham Silberschatz 

December 1990 ACM Computing Surveys (CSUR), volume 22 issue 4 

Full text available: pdf(5.33 MB) Additional Information: full citation, abstract, references, citings, index terms, review 

The purpose of a distributed file system (DFS) is to allow users of physically distributed 
computers to share data and storage resources by using a common file system. A typical 
configuration for a DFS is a collection of workstations and mainframes connected by a local 
area network (LAN). A DFS is implemented as part of the operating system of each of the 
connected computers. This paper establishes a viewpoint that emphasizes the dispersed 
structure and decentralization of both data and con ... 



18 Exploiting Value Locality in Physical Register Files 
Saisanthosh Balakrishnan, Gurindar S. Sohi 

December 2003 Proceedings of the 36th Annual IEEE/ACM International Symposium on 
Microarchitecture 

pdf(194.25 KB) 

Additional Information: full citation, abstract, index terms 
Publisher Site 

The physical register file is an important component of a dynamically-scheduled processor. 
Increasing the amount of parallelism places increasing demands on the physical register 
file, calling for alternative file organization and management strategies. This paper considers 
the use of value locality to optimize the operation of physical register files. We present 
empirical data showing that: (i) the value produced by an instruction is often the same as a 
value produced by another recently executed instr ... 

19 View management in multimedia databases 
K. Selçuk Candan, Eric Lemar, V. S. Subrahmanian 

July 2000 The VLDB Journal - The International Journal on Very Large Data Bases, 
Volume 9 Issue 2 

Full text available: pdf(322.82 KB) Additional Information: full citation, abstract, index terms 

Though there has been extensive work on multimedia databases in the last few years, there 
is no prevailing notion of a multimedia view, nor are there techniques to create, manage, 
and maintain such views. Visualizing the results of a dynamic multimedia query or 
materializing a dynamic multimedia view corresponds to assembling and delivering an 
interactive multimedia presentation in accordance with the visualization specifications. In 
this paper, we suggest that a non-interactive multimedia prese ... 

Keywords: Interactivity, Multimedia databases, Prefetching, Result 
visualization/presentation, View management 

20 System on chip design: An integrated algorithm for memory allocation and assignment 
in high-level synthesis 

Jaewon Seo, Taewhan Kim, Preeti R. Panda 

June 2002 Proceedings of the 39th conference on Design automation 

Full text available: pdf(134.60 KB) Additional Information: full citation, abstract, references, index terms 

With the increasing design complexity and performance requirement, data arrays in 
behavioral specification are usually mapped to fast on-chip memories in behavioral 
synthesis. This paper describes a new algorithm that overcomes two limitations of the 
previous works on the problem of memory-allocation and array-mapping to memories. 
Specifically, its key features are (1) a tight link to the scheduling effect, which was totally or 
partially ignored by the existing memory synthesis systems, a ... 

Keywords: memory allocation, memory assignment, memory design, scheduling effect 

21 A Multi-Agent Systems Approach to Autonomic Computing 
Gerald Tesauro, David M. Chess, William E. Walsh, Rajarshi Das, Alla Segal, Ian Whalley, 
Jeffrey O. Kephart, Steve R. White 

July 2004 Proceedings of the Third International Joint Conference on Autonomous 
Agents and Multiagent Systems - Volume 1 

Full text available: pdf(208.12 KB) Additional Information: full citation, abstract, index terms 

The goal of autonomic computing is to create computing systems capable of managing 
themselves to a far greater extent than they do today. This paper presents Unity, a 
decentralized architecture for autonomic computing based on multiple interacting agents 
called autonomic elements. We illustrate how the Unity architecture realizes a number of 
desired autonomic system behaviors including goal-driven self-assembly, self-healing, and 
real-time self-optimization. We then present a realistic prototype ... 

22 Provisioning algorithms for WDM optical networks 
Murat Alanyali, Ender Ayanoglu 

October 1999 IEEE/ACM Transactions on Networking (TON), volume 7 issue 5 

Full text available: pdf(289.62 KB) Additional Information: full citation, references, citings, index terms 



23 Research sessions: selectivity: Hierarchical subspace sampling: a unified framework 
for high dimensional data reduction, selectivity estimation and nearest neighbor search 
Charu C. Aggarwal 

June 2002 Proceedings of the 2002 ACM SIGMOD international conference on 
Management of data 

Full text available: pdf(1.40 MB) Additional Information: full citation, abstract, references, citings, index terms 

With the increased abilities for automated data collection made possible by modern 
technology, the typical sizes of data collections have continued to grow in recent years. In 
such cases, it may be desirable to store the data in a reduced format in order to improve the 
storage, transfer time, and processing requirements on the data. One of the challenges of 
designing effective data compression techniques is to be able to preserve the ability to use 
the reduced format directly for a wide range of ... 

24 Reconfiguration of carrier assignment in cellular networks 
Angelos N. Rouskas, Michael G. Kazantzakis, Miltiades E. Anagnostou 

December 1999 Wireless Networks, volume 5 issue 6 

Full text available: pdf(241.51 KB) Additional Information: full citation, references, index terms 

25 Dynamic channel assignment in wireless communication networks 
Anna Hac, Chunlei Mo 

March 1999 International Journal of Network Management, Volume 9 issue 1 

Full text available: pdf(359.91 KB) Additional Information: full citation, abstract, references, index terms 

We propose a new Cochannel information based Dynamic Channel Assignment 
(CDCA) strategy for small and microcell systems and a new Group Dynamic 
Channel Assignment (GDCA) strategy which handles multichannel traffic in 
wireless networks. Copyright © 1999 John Wiley & Sons, Ltd. 

26 Agent behavior and agent models in unregulated markets 
K. Smith, R. Paranjape, L. Benedicenti 

September 2001 ACM SIGAPP Applied Computing Review, Volume 9 issue 3 
Full text available: pdf(936.06 KB) Additional Information: full citation, abstract, references 

Mobile-agent systems show significant promise as the most effective way to harness the 
power of the Internet and the massive collection of information and opportunity that the 
Internet holds. However the efficient organization and control of these systems remains one 
of a number of unsolved problems with this approach to network computing. This paper 
examines a mobile-agent system with specific focus on environment sensing, preemptive 
load balancing and open agent markets. Agent behaviour is stu ... 

Keywords: AR modeling, agent system modeling, environment sensing, load balancing, 
mobile agents 



27 MA-WATM: a new approach towards an adaptive wireless ATM network 
Khaldoun Al Agha, Houda Labiod 

May 1999 Mobile Networks and Applications, Volume 4 issue 2 

Full text available: pdf(176.03 KB) Additional Information: full citation, abstract, references, index terms 

In a cellular multimedia network like wireless ATM (WATM), self control seems primordial. 
Our new approach is based on the application of DAI (distributed artificial intelligence) 
techniques in order to build a self-adaptive network within random non-uniform traffic 
conditions. Attempting to achieve a high network capacity in terms of resource allocation 
and air interface BER (bit error rate), we propose to apply intelligent agent features to 
enhance the architecture of WATM systems. In fac ... 

28 Measurement: The impact of address allocation and routing on the structure and 
implementation of routing tables 

Harsha Narayan, Ramesh Govindan, George Varghese 

August 2003 Proceedings of the 2003 conference on Applications, technologies, 
architectures, and protocols for computer communications 

Full text available: pdf(148.92 KB) Additional Information: full citation, abstract, references, index terms 

The recent growth in the size of the routing table has led to an interest in quantitatively 
understanding both the causes (e.g., multihoming) as well as the effects (e.g., impact on router 
lookup implementations) of such routing table growth. In this paper, we describe a new 
model called ARAM that defines the structure of routing tables of any given size. Unlike 
simpler empirical models that work backwards from effects (e.g., current prefix length 
distributions), ARAM a ... 

Keywords: IP lookups, modeling, routing tables 

Huo Yan Chen, T. H. Tse, T. Y. Chen 

January 2001 ACM Transactions on Software Engineering and Methodology (TOSEM), 
Volume 10 Issue 1 

Full text available: pdf(289.85 KB) Additional Information: full citation, abstract, references, citings, index terms, review 

Object-oriented programming consists of several different levels of abstraction, namely, the 
algorithmic level, class level, cluster level, and system level. The testing of object-oriented 
software at the algorithmic and system levels is similar to conventional program testing. 
Testing at the class and cluster levels poses new challenges. Since methods and objects 
may interact with one another with unforeseen combinations and invocations, they are much 
more complex to simulate and test than ... 

Keywords: algebraic specifications, contract specifications, message passing, object- 
oriented programming, software testing 



30 Optimal partitioned and end-case placers for standard-cell layout 
A. E. Caldwell, A. B. Kahng, I. L. Markov 

April 1999 Proceedings of the 1999 international symposium on Physical design 

Full text available: pdf(1.04 MB) Additional Information: full citation, references, citings, index terms 



31 Research track: Translation-invariant mixture models for curve clustering 
Darya Chudova, Scott Gaffney, Eric Mjolsness, Padhraic Smyth 

August 2003 Proceedings of the ninth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(688.59 KB) Additional Information: full citation, abstract, references, index terms 

In this paper we present a family of algorithms that can simultaneously align and cluster 
sets of multidimensional curves defined on a discrete time grid. Our approach uses the 
Expectation-Maximization (EM) algorithm to recover both the mean curve shapes for each 
cluster, and the most likely shifts, offsets, and cluster memberships for each curve. We 
demonstrate how Bayesian estimation methods can improve the results for small sample 
sizes by enforcing smoothness in the cluster mean curves. We e ... 

Keywords: EM, alignment, curve clustering, mixture model, transformation invariance 



32 An energy-conscious algorithm for memory port allocation 
Preeti Ranjan Panda, Lakshmikantam Chitturi 

November 2002 Proceedings of the 2002 IEEE/ACM international conference on 
Computer-aided design 

Full text available: pdf(111.21 KB) Additional Information: full citation, abstract, references, index terms 

Multiport memories are extensively used in modern system designs because of the 
performance advantages they offer. The increased memory access throughput could lead to 
significantly faster schedules in behavioral synthesis. However, they also have an associated 
area and energy penalty. We describe a technique for mapping data accesses to multiport 
memories during behavioral synthesis that results in significantly better energy 
characteristics than an unoptimized multiport design. The technique c ... 

33 Efficient parallel algorithms can be made robust 
P. C. Kanellakis, A. A. Shvartsman 

June 1989 Proceedings of the eighth annual ACM Symposium on Principles of 
distributed computing 

Full text available: pdf(1.17 MB) Additional Information: full citation, references, citings, index terms 

34 Processor allocation policies for message-passing parallel computers 
Cathy McCann, John Zahorjan 

May 1994 ACM SIGMETRICS Performance Evaluation Review, Proceedings of the 1994 
ACM SIGMETRICS conference on Measurement and modeling of computer 
systems, Volume 22 Issue 1 

Full text available: pdf(1.50 MB) Additional Information: full citation, abstract, references, citings, index terms 

When multiple jobs compete for processing resources on a parallel computer, the operating 
system kernel's processor allocation policy determines how many and which processors to 
allocate to each. In this paper we investigate the issues involved in constructing a processor 
allocation policy for large scale, message-passing parallel computers supporting a scientific 
workload. We make four specific contributions: We define the concept of efficiency 
preservat ... 

35 Fast detection of communication patterns in distributed executions 
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies 
on Collaborative research 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms 

Understanding distributed applications is a tedious and difficult task. Visualizations based on 
process-time diagrams are often used to obtain a better understanding of the execution of 
the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not provide 
the user with the desired overview of the application. In our experience, such tools display 
repeated occurrences of non-trivial commun ... 

36 Learning II: A time series clustering based framework for multimedia mining and 
summarization using audio features 

Regunathan Radhakrishnan, Ajay Divakaran, Ziyou Xiong 

October 2004 Proceedings of the 6th ACM SIGMM international workshop on 
Multimedia information retrieval 

Full text available: pdf(618.98 KB) Additional Information: full citation, abstract, references, index terms 

Past work on multimedia analysis has shown the utility of detecting specific temporal 
patterns for different content genres. In this paper, we propose a unified, content-adaptive, 
unsupervised mining framework to bring out such temporal patterns from different 
multimedia genres. We formulate the problem of pattern discovery from video as a time 
series clustering problem. We treat the sequence of low/mid level audio-visual features 
extracted from the video as a time series and perform a tempor ... 

Keywords: audio classification, time series analysis, video summarization 



37 Lexicon acquisition: The acquisition of lexical knowledge from combined machine- 
readable dictionary sources 
Antonio Sanfilippo, Victor Poznański 

March 1992 Proceedings of the third conference on Applied natural language 
processing 

pdf(816.16 KB) 

Additional Information: full citation, abstract, references, citings 
Publisher Site 

This paper is concerned with the question of how to extract lexical knowledge from Machine- 
Readable Dictionaries (MRDs) within a lexical database which integrates a lexicon 
development environment. Our long term objective is the creation of a large lexical 
knowledge base using semiautomatic techniques to recover syntactic and semantic 
information from MRDs. In doing so, one finds that reliance on a single MRD source induces 
inadequacies which could be efficiently redressed through access to comb ... 

38 Evaluation of Placement Techniques for DNA Probe Array Layout 
Andrew B. Kahng, Ion Mandoiu, Sherief Reda, Xu Xu, Alex Z. Zelikovsky 
November 2003 Proceedings of the 2003 IEEE/ACM international conference on 
Computer-aided design 

Full text available: pdf(200.42 KB) Additional Information: full citation, abstract, index terms 

DNA probe arrays have emerged as a core genomic technology that enables cost-effective 
gene expression monitoring, mutation detection, single nucleotide polymorphism analysis 
and other genomic analyses. DNA chips are manufactured through a highly scalable process, 
Very Large-Scale Immobilized Polymer Synthesis (VL-SIPS), that combines photolithographic 
technologies adapted from the semiconductor industry with combinatorial chemistry. 
Commercially available DNA chips contain more than a half million prob ... 



39 Context-specific Bayesian clustering for gene expression data 
Yoseph Barash, Nir Friedman 

April 2001 Proceedings of the fifth annual international conference on Computational 
biology 

Full text available: pdf(233.32 KB) Additional Information: full citation, abstract, references, citings, index terms 

The recent growth in genomic data and measurement of genome-wide expression patterns 
allows to examine gene regulation by transcription factors using computational tools. In this 
work, we present a class of mathematical models that help in understanding the connections 
between transcription factors and functional classes of genes based on genetic and genomic 
data. These models represent the joint distribution of transcription factor binding sites and 
of expression levels of a gene in a single ... 

40 DB-1 (databases): data integration: Organizing structured web sources by query 
schemas: a clustering approach 

Bin He, Tao Tao, Kevin Chen-Chuan Chang 

November 2004 Proceedings of the Thirteenth ACM conference on Information and 
knowledge management 

Full text available: pdf(323.72 KB) Additional Information: full citation, abstract, references, index terms 

In recent years, the Web has been rapidly "deepened" with the prevalence of databases 
online. On this deep Web, many sources are structured by providing structured 
query interfaces and results. Organizing such structured sources into a domain hierarchy is 
one of the critical steps toward the integration of heterogeneous Web sources. We observe 
that, for structured Web sources, query schemas (i.e., attributes in query interfaces) 
are discriminative representative ... 

Keywords: data integration, deep Web, hierarchical agglomerative clustering 